22 research outputs found

    CNN-LSTM Architecture for Action Recognition in Videos

    Action recognition in videos is currently a topic of interest in the area of computer vision, due to potential applications such as multimedia indexing and surveillance in public spaces, among others. In this paper we propose a CNN-LSTM architecture. First, a pre-trained VGG16 convolutional neural network extracts the features of the input video. Then, an LSTM classifies the video into a particular class. To carry out training and testing, we used the UCF-11 dataset. We evaluate the performance of our system using accuracy as the evaluation metric. Applying leave-one-out cross-validation (LOOCV) with k = 25, we obtain approximately 98% accuracy for training and 91% for testing. Sociedad Argentina de Informática e Investigación Operativa.
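
    The abstract describes the pipeline only at a high level. The following is a minimal Keras sketch of such a VGG16-feature + LSTM classifier, not the authors' code; the clip length, LSTM width and training settings are illustrative assumptions (UCF-11 does have 11 classes).

    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    NUM_FRAMES, NUM_CLASSES = 25, 11                       # assumed clip length; UCF-11 has 11 classes

    # Frozen, ImageNet-pre-trained VGG16 used purely as a per-frame feature extractor.
    backbone = VGG16(weights="imagenet", include_top=False, pooling="avg",
                     input_shape=(224, 224, 3))
    backbone.trainable = False

    model = models.Sequential([
        layers.Input(shape=(NUM_FRAMES, 224, 224, 3)),
        layers.TimeDistributed(backbone),                  # -> (NUM_FRAMES, 512) features per clip
        layers.LSTM(256),                                  # temporal modelling; unit count assumed
        layers.Dense(NUM_CLASSES, activation="softmax"),   # one score per action class
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()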

    BiLSTM with CNN Features For HAR in Videos

    Action recognition in videos is currently a topic of interest in the area of computer vision due to its potential applications, such as multimedia indexing and surveillance in public spaces, among others. In this work we propose a CNN-BiLSTM architecture. First, a pre-trained VGG16 convolutional neural network extracts the features of the input video. Then, a BiLSTM classifies the video into a particular class. We evaluate the performance of our system using accuracy as the evaluation metric, obtaining 40.9% and 78.1% for the HMDB-51 and UCF-101 datasets, respectively. Sociedad Argentina de Informática e Investigación Operativa.
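
    The architecture described here differs from the CNN-LSTM entry above mainly in the recurrent layer. A minimal sketch of that change, assuming the same precomputed per-frame VGG16 features and an illustrative unit count:

    from tensorflow.keras import layers

    # Drop-in replacement for the plain LSTM of the earlier sketch: a bidirectional
    # LSTM reads the frame sequence forwards and backwards before classification.
    recurrent = layers.Bidirectional(layers.LSTM(256))     # unit count is an assumption

    Reading the sequence in both directions lets the classifier use temporal context from frames both before and after each time step.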

    CNN-LSTM with a Soft Attention Mechanism for Human Action Recognition in Videos

    Action recognition in videos is currently a topic of interest in the area of computer vision, due to potential applications such as multimedia indexing and surveillance in public spaces, among others. Attention mechanisms have become a very important concept within the deep learning approach; they try to imitate the visual ability of people to focus their attention on relevant parts of a scene in order to extract important information. In this paper we propose a soft attention mechanism adapted to a base CNN-LSTM architecture. First, a VGG16 convolutional neural network extracts the features from the input video. Then an LSTM classifies the video into a particular class. To carry out the training and testing phases, we used the HMDB-51 and UCF-101 datasets. We evaluate the performance of our system using accuracy as the evaluation metric, obtaining 40.7% (base approach) and 51.2% (with attention) for HMDB-51, and 75.8% (base approach) and 87.2% (with attention) for UCF-101.
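
    The following sketch illustrates one common form of soft attention over precomputed per-frame CNN features: a small scoring network produces a weight per time step, and a weighted sum of the LSTM hidden states feeds the classifier. It shows the generic mechanism, not necessarily the paper's exact formulation; all shapes and layer sizes are assumptions.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    NUM_FRAMES, FEAT_DIM, NUM_CLASSES = 25, 512, 51        # illustrative; HMDB-51 has 51 classes

    frames = layers.Input(shape=(NUM_FRAMES, FEAT_DIM))    # precomputed per-frame VGG16 features
    hidden = layers.LSTM(256, return_sequences=True)(frames)   # one hidden state per frame

    # Soft attention: score each time step, normalise the scores with a softmax,
    # and pool the hidden states with the resulting weights.
    scores  = layers.Dense(1)(hidden)                       # (batch, T, 1)
    weights = layers.Softmax(axis=1)(scores)                # attention weights over time
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1)        # weighted sum -> (batch, 256)
    )([weights, hidden])

    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(context)
    model = models.Model(frames, outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])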

    Stereo Parallel Tracking and Mapping for Robot Localization

    This paper describes a visual SLAM system based on stereo cameras and focused on real-time localization for mobile robots. To achieve this, it heavily exploits the parallel nature of the SLAM problem, separating the time-constrained pose estimation from less pressing matters such as map building and refinement tasks. On the other hand, the stereo setting makes it possible to reconstruct a metric 3D map for each frame of stereo images, improving the accuracy of the mapping process with respect to monocular SLAM and avoiding the well-known bootstrapping problem. Also, the real scale of the environment is an essential feature for robots that have to interact with their surrounding workspace. A series of experiments, performed both on-line on a robot and off-line with public datasets, validates the accuracy and real-time performance of the developed method.
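
    The parallel structure described above can be pictured as two loops sharing a map: a time-critical tracking loop and a background mapping loop. The following is only a schematic Python sketch of that split; all class and method names (grab_stereo_pair, localize, add_stereo_keyframe, refine, etc.) are placeholders, not the actual API of the system.

    import queue
    import threading

    keyframe_queue = queue.Queue()

    def tracking_loop(camera, slam_map, pose_publisher):
        """Time-constrained: estimate the camera pose for every incoming stereo frame."""
        while True:
            left, right = camera.grab_stereo_pair()
            pose = slam_map.localize(left, right)           # match features, solve the pose
            pose_publisher.publish(pose)
            if slam_map.needs_keyframe(pose):
                keyframe_queue.put((left, right, pose))     # defer heavy work to the mapper

    def mapping_loop(slam_map):
        """Less pressing: triangulate metric 3D points from stereo pairs and refine the map."""
        while True:
            left, right, pose = keyframe_queue.get()
            slam_map.add_stereo_keyframe(left, right, pose)
            slam_map.refine()                               # e.g. local bundle adjustment

    def run(camera, slam_map, pose_publisher):
        # Mapping runs in the background so it never blocks pose estimation.
        threading.Thread(target=mapping_loop, args=(slam_map,), daemon=True).start()
        tracking_loop(camera, slam_map, pose_publisher)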

    Human Action Recognition in Videos Using a Robust CNN-LSTM Neural Network

    Action recognition in videos is currently a topic of interest in the area of computer vision, due to potential applications such as multimedia indexing and surveillance in public spaces, among others. In this paper we propose: (1) the implementation of a CNN-LSTM architecture, in which a pre-trained VGG16 convolutional neural network first extracts the features of the input video and an LSTM then classifies the sequence into a particular class; (2) a study of how the number of LSTM units affects the performance of the system; and (3) an evaluation of the performance of our system using accuracy as the evaluation metric, given the existing class balance in the datasets. To carry out the training and test phases, we used the KTH, UCF-11 and HMDB-51 datasets, obtaining 93%, 91% and 47% accuracy respectively and improving state-of-the-art results for the first two. Beyond these results, the main contribution of this work lies in the evaluation of different CNN-LSTM architectures for the action recognition task.
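
    Point (2) amounts to sweeping the width of the recurrent layer and comparing accuracies. A minimal sketch, assuming precomputed per-frame VGG16 features and an illustrative set of unit counts (the values actually studied are not given in the abstract):

    from tensorflow.keras import layers, models

    NUM_FRAMES, FEAT_DIM, NUM_CLASSES = 25, 512, 6          # illustrative; e.g. KTH has 6 classes

    def build_model(lstm_units):
        # Same pipeline as in the earlier sketch, parameterised by the number of LSTM units.
        return models.Sequential([
            layers.Input(shape=(NUM_FRAMES, FEAT_DIM)),     # precomputed VGG16 features
            layers.LSTM(lstm_units),
            layers.Dense(NUM_CLASSES, activation="softmax"),
        ])

    for units in (64, 128, 256, 512):                       # assumed sweep, not the paper's values
        model = build_model(units)
        model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
        # model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=10)
        # would then be compared across unit counts on validation accuracy.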

    Models for Synthetic Aperture Radar Image Analysis

    After reviewing some classical statistical hypotheses commonly used in image processing and analysis, this paper presents some models that are useful in synthetic aperture radar (SAR) image analysis.

    Polarimetric SAR Image Segmentation with B-Splines and a New Statistical Model

    We present an approach for polarimetric Synthetic Aperture Radar (SAR) image region boundary detection based on the use of B-spline active contours and a new model for polarimetric SAR data: the GHP distribution. In order to detect the boundary of a region, initial B-spline curves are specified, either automatically or manually, and the proposed algorithm uses a deformable-contour technique to find the boundary. In doing so, the parameters of the polarimetric GHP model for the data are estimated in order to find the transition points between the region being segmented and the surrounding area. This is a local algorithm, since it works only on the region to be segmented. Results assessing its performance are presented.
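
    For the B-spline contour representation the method builds on, a small SciPy sketch: a closed cubic B-spline is fitted through a handful of boundary points and evaluated densely, giving the curve that a deformable-contour step would then push toward the detected transition points. The points and smoothing value are illustrative; this is not the paper's algorithm.

    import numpy as np
    from scipy.interpolate import splprep, splev

    # Rough initial boundary points around a region (specified automatically or manually).
    theta = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
    x = 100 + 40 * np.cos(theta) + np.random.normal(0, 2, theta.size)
    y = 120 + 30 * np.sin(theta) + np.random.normal(0, 2, theta.size)

    # Fit a periodic (closed) cubic B-spline through the points.
    tck, u = splprep([x, y], s=5.0, per=True)

    # Evaluate a dense, smooth contour from the spline; this is the curve that the
    # deformable-contour step would iteratively update.
    u_dense = np.linspace(0.0, 1.0, 200)
    xs, ys = splev(u_dense, tck)
    boundary = np.stack([xs, ys], axis=1)                   # (200, 2) contour points
    print(boundary.shape)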

    New families of polarimetric distributions for SAR images

    This thesis presents the derivation of a new distribution for polarimetric Synthetic Aperture Radar (SAR) imagery. The distribution is based on the multiplicative model, assuming a multivariate complex Wishart law for the speckle and an inverse Gaussian law for the backscatter. From this proposal, the harmonic polarimetric distribution is obtained and, as a particular case, the harmonic distributions for intensity and amplitude data. Moments-based estimators for the parameters that index these distributions are derived and assessed. It is shown that extracting these parameters as features is a way of augmenting the information content and the discriminating power in SAR image classification. Fil: Jacobo Berlles, Julio C. A. (Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Argentina).
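
    In the notation one would typically use for this construction (the symbols below are assumptions, not taken from the thesis), the multiplicative model combines the two laws mentioned in the abstract as follows:

    % Multiplicative model: observed polarimetric return = backscatter x speckle.
    \[
      Z = \tau\, W, \qquad
      \tau \sim \mathrm{IG}(\omega, \eta) \quad \text{(inverse Gaussian backscatter)}, \qquad
      W \sim \mathcal{W}_{\mathbb{C}}(n, \Sigma) \quad \text{(complex Wishart speckle)} .
    \]
    % The density of the resulting (harmonic) polarimetric distribution follows by
    % integrating the conditional density over the backscatter law:
    \[
      f_Z(z) = \int_{0}^{\infty} f_{Z \mid \tau}(z \mid \tau)\, f_{\tau}(\tau)\, d\tau .
    \]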